Efficient Document Clustering via Online Nonnegative Matrix Factorizations

Authors

  • Fei Wang
  • Ping Li
  • Arnd Christian König
Abstract

In recent years, Nonnegative Matrix Factorization (NMF) has received considerable interest from the data mining and information retrieval communities. NMF has been successfully applied to document clustering, image representation, and other domains. This study proposes an online NMF (ONMF) algorithm that efficiently handles very large-scale and/or streaming datasets. Unlike conventional NMF solutions, which require the entire data matrix to reside in memory, the ONMF algorithm processes one data point or one chunk of data points at a time. Experiments with one-pass and multi-pass ONMF on real datasets are presented.
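
The abstract does not give the update rules, so the following is only a minimal sketch of the general idea of chunk-wise NMF, not the authors' ONMF algorithm. It assumes a squared-error objective, standard multiplicative updates, and two running summary matrices (`A` and `B`) so that no earlier chunk has to be kept in memory. The function name `online_nmf` and all of its parameters are hypothetical.

```python
import numpy as np

def online_nmf(chunks, n_features, rank, inner_iters=50, eps=1e-10, seed=0):
    """Illustrative chunk-wise NMF sketch (an assumption, not the paper's method).

    Each chunk X (n_features x chunk_size, nonnegative) is approximated as
    W @ H, where the basis W is shared across all chunks.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((n_features, rank)) + eps
    A = np.zeros((n_features, rank))  # running sum of X_t @ H_t.T
    B = np.zeros((rank, rank))        # running sum of H_t @ H_t.T
    codes = []
    for X in chunks:
        # Fit the codes for the current chunk with the basis held fixed.
        H = rng.random((rank, X.shape[1])) + eps
        for _ in range(inner_iters):
            H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Fold the chunk into the summaries; the chunk itself can be discarded.
        A += X @ H.T
        B += H @ H.T
        # Refresh the shared basis from the summaries only.
        for _ in range(inner_iters):
            W *= A / (W @ B + eps)
        codes.append(H)
    return W, codes

# Synthetic nonnegative data, streamed in five column chunks.
rng = np.random.default_rng(1)
X_full = rng.random((20, 3)) @ rng.random((3, 100))
W, codes = online_nmf(np.array_split(X_full, 5, axis=1), n_features=20, rank=3)
print(W.shape, [H.shape for H in codes])
```

For document clustering, one common convention (again an assumption here) is to treat each column of a chunk as a document vector and assign it to the cluster with the largest entry in its column of `H`.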


Similar resources

Subtractive Initialization of Nonnegative Matrix Factorizations for Document Clustering

Nonnegative matrix factorizations (NMF) have recently assumed an important role in several fields, such as pattern recognition, automated image exploitation, data clustering and so on. They represent a peculiar tool adopted to obtain a reduced representation of multivariate data by using additive components only, in order to learn parts-based representations of data. All algorithms for computin...


Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations

Nonnegative matrix factorization (NMF) and its extensions such as Nonnegative Tensor Factorization (NTF) have become prominent techniques for blind sources separation (BSS), analysis of image databases, data mining and other information retrieval and clustering applications. In this paper we propose a family of efficient algorithms for NMF/NTF, as well as sparse nonnegative coding and represent...


Algorithms for Nonnegative Tensor Factorization

Nonnegative Matrix Factorization (NMF) is an efficient technique to approximate a large matrix containing only nonnegative elements as a product of two nonnegative matrices of significantly smaller size. The guaranteed nonnegativity of the factors is a distinctive property that other widely used matrix factorization methods do not have. Matrices can also be seen as second-order tensors. For som...


Nonnegative Matrix Factorization with Orthogonality Constraints

Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering), but in some cases NMF produces the results inappropria...


Tensor Decompositions: A New Concept in Brain Data Analysis?

Matrix factorizations and their extensions to tensor factorizations and decompositions have become prominent techniques for linear and multilinear blind source separation (BSS), especially multiway Independent Component Analysis (ICA), Nonnegative Matrix and Tensor Factorization (NMF/NTF), Smooth Component Analysis (SmoCA) and Sparse Component Analysis (SCA). Moreover, tensor decompositions hav...




Publication date: 2011